Description of the Odin Event Extraction Framework and Rule Language
Abstract
This document describes the Odin framework, which is a domain-independent platform for developing rule-based event extraction models. Odin aims to be powerful (the rule language allows the modeling of complex syntactic structures) and robust (to recover from syntactic parsing errors, syntactic patterns can be freely mixed with surface, token-based patterns), while remaining simple (some domain grammars can be up and running in minutes), and fast (Odin processes over 100 sentences/second in a real-world domain with over 200 rules). Here we include a thorough definition of the Odin rule language, together with a description of the Odin API in the Scala language, which allows one to apply these rules to arbitrary texts.
arXiv:1509.07513v1 [cs.CL] 24 Sep 2015
Similar resources
A Domain-independent Rule-based Framework for Event Extraction
We describe the design, development, and API of ODIN (Open Domain INformer), a domain-independent, rule-based event extraction (EE) framework. The proposed EE approach is: simple (most events are captured with simple lexico-syntactic patterns), powerful (the language can capture complex constructs, such as events taking other events as arguments, and regular expressions over syntactic graphs), r...
Odin's Runes: A Rule Language for Information Extraction
Odin is an information extraction framework that applies cascades of finite state automata over both surface text and syntactic dependency graphs. Support for syntactic patterns allows us to concisely define relations that are otherwise difficult to express in languages such as Common Pattern Specification Language (CPSL), which are currently limited to shallow linguistic features. The interacti...
Applying Web Semantics to Evaluate Biomedical Knowledge Graphs
Knowledge graphs (KGs) are being used extensively for data-driven applications. Due to their large scale and heterogeneity, KGs are often constructed using automated IE toolkits. Owing to the diverse nature of the sources, such extractions are often noisy and contain many semantic inaccuracies. In domains such as life sciences, having high-quality and consistent KGs is very important to improve...
Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships
Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. Due to the inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. This thesis explores a modular ontology-based approach to information extraction that decouples domain-spec...
Second Workshop on Natural Language Processing and Linked Open Data
We describe a system for event extraction across documents and languages. We developed a framework for the interoperable semantic interpretation of mentions of events, participants, locations and time, as well as the relations between them. Furthermore, we use a common RDF model to represent instances of events and normalised entities and dates. We convert multiple mentions of the same event in...
Journal: CoRR
Volume: abs/1509.07513
Publication year: 2015